skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Search for: All records

Creators/Authors contains: "Widener, Patrick"

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

  1. null (Ed.)
    Large-scale, high-throughput computational science faces an accelerating convergence of software and hardware. Software container-based solutions have become common in cloud-based datacenter environments, and are considered promising tools for addressing heterogeneity and portability concerns. However, container solutions reflect a set of assumptions which complicate their adoption by developers and users of scientific workflow applications. Nor are containers a universal solution for deployment in high-performance computing (HPC) environments which have specialized and vertically integrated scheduling and runtime software stacks. In this paper, we present a container design and deployment approach which uses modular layering to ease the deployment of containers into existing HPC environments. This layered approach allows operating system integrations, support for different communication and performance monitoring libraries, and application code to be defined and interchanged in isolation. We describe in this paper the details of our approach, including specifics about container deployment and orchestration for different HPC scheduling systems. We also describe how this layering method can be used to build containers for two separate applications, each deployed on clusters with different batch schedulers, MPI networking support, and performance monitoring requirements. Our experience indicates that the layered approach is a viable strategy for building applications intended to provide similar behavior across widely varying deployment targets. 
    more » « less
  2. null (Ed.)
    Performance variation deriving from hardware and software sources is common in modern scientific and data-intensive computing systems, and synchronization in parallel and distributed programs often exacerbates their impacts at scale. The decentralized and emergent effects of such variation are, unfortunately, also difficult to systematically measure, analyze, and predict; modeling assumptions which are stringent enough to make analysis tractable frequently cannot be guaranteed at meaningful application scales, and longitudinal methods at such scales can require the capture and manipulation of impractically large amounts of data. This paper describes a new, scalable, and statistically robust approach for effective modeling, measurement, and analysis of large-scale performance variation in HPC systems. Our approach avoids the need to reason about complex distributions of runtimes among large numbers of individual application processes by focusing instead on the maximum length of distributed workload intervals. We describe this approach and its implementation in MPI which makes it applicable to a diverse set of HPC workloads. We also present evaluations of these techniques for quantifying and predicting performance variation carried out on large-scale computing systems, and discuss the strengths and limitations of the underlying modeling assumptions. 
    more » « less